BLEUÂTRE: Flattening Syntactic Dependencies for MT Evaluation
Abstract
This paper describes a novel approach to syntactically informed evaluation of machine translation (MT). Using a statistical, treebank-trained parser, we extract word-word dependencies from reference translations and then compile these dependencies into a representation that allows candidate translations to be evaluated by string comparisons, as is done in n-gram approaches to MT evaluation. This approach gains the benefit of syntactic analysis of the reference translations, but avoids the need to parse potentially noisy candidate translations. Preliminary experiments using 15,242 judgments of reference-candidate pairs from translations of Chinese newswire text show that the correlation of our approach with human judgments is only slightly lower than other reported results. With the addition of multiple reference translations, however, performance improves markedly. These results are encouraging, especially given that our system is a prototype and makes no essential use of synonymy, paraphrasing or inflectional morphological information, all of which would be easy to add.
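The general idea of flattening reference dependencies so that candidates can be scored by string comparison alone can be illustrated with a minimal Python sketch. It is not the paper's actual formulation: the abstract does not specify the flattened representation or the scoring formula, so the ordered word-pair encoding, the recall-style score, the max over multiple references, and every function name below are assumptions made for illustration. The head indices are supplied by hand to stand in for parser output.

```python
import re


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"\w+", text.lower())


def flatten_dependencies(tokens, heads):
    """Flatten a dependency analysis into ordered word pairs.

    heads[i] is the index of token i's head, or None for the root.
    Each head-dependent link becomes (left_word, right_word) in surface
    order, so it can later be checked without parsing the candidate.
    This encoding is an assumption, not the representation in the paper.
    """
    pairs = []
    for dep_idx, head_idx in enumerate(heads):
        if head_idx is None:
            continue
        left, right = sorted((dep_idx, head_idx))
        pairs.append((tokens[left], tokens[right]))
    return pairs


def pair_matches(candidate_tokens, pair):
    """True if the first word occurs somewhere before the second word."""
    first, second = pair
    for i, tok in enumerate(candidate_tokens):
        if tok == first and second in candidate_tokens[i + 1:]:
            return True
    return False


def dependency_recall(candidate, flattened_refs):
    """Best fraction of flattened reference pairs found in the candidate.

    With several references, the highest-scoring one is used (an assumed
    way of combining multiple references).
    """
    cand = tokenize(candidate)
    best = 0.0
    for pairs in flattened_refs:
        if not pairs:
            continue
        hits = sum(pair_matches(cand, p) for p in pairs)
        best = max(best, hits / len(pairs))
    return best


# Hand-written analysis of one reference: "chased" is the root.
ref_tokens = tokenize("the cat chased the mouse")
ref_heads = [1, 2, None, 4, 2]
ref_pairs = flatten_dependencies(ref_tokens, ref_heads)

print(dependency_recall("the cat chased a mouse", [ref_pairs]))    # 1.0
print(dependency_recall("the mouse chased the cat", [ref_pairs]))  # 0.5
```

Only the reference is analysed syntactically; the candidate is treated as a plain token sequence, which is the property the abstract highlights. The second candidate reuses every reference word but reverses the argument roles, so half of the flattened pairs fail to match.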
Similar resources
Syntactic Features for Evaluation of Machine Translation
Automatic evaluation of machine translation, based on computing n-gram similarity between system output and human reference translations, has revolutionized the development of MT systems. We explore the use of syntactic information, including constituent labels and head-modifier dependencies, in computing similarity between output and reference. Our results show that adding syntactic informatio...
Labelled Dependencies in Machine Translation Evaluation
We present a method for evaluating the quality of Machine Translation (MT) output, using labelled dependencies produced by a Lexical-Functional Grammar (LFG) parser. Our dependency-based method, in contrast to most popular string-based evaluation metrics, does not unfairly penalize perfectly valid syntactic variations in the translation, and the addition of WordNet provides a way to accommodate ...
SEPIA: Surface Span Extension to Syntactic Dependency Precision-based MT Evaluation
We present a new Machine Translation (MT) evaluation metric, SEPIA. SEPIA falls within the class of syntactically-aware evaluation metrics, which have been getting a lot of attention recently (Liu and Gildea, 2005; Owczarzak et al., 2007; Giménez and Màrquez, 2007). Specifically, SEPIA uses dependency representation but extends it to include surface span as a factor in the evaluation score. The...
Collocations in a Rule-Based MT System: A Case Study Evaluation of Their Translation Adequacy
Collocations constitute a subclass of multi-word expressions that are particularly problematic for machine translation, due 1) to their omnipresence in texts, and 2) to their morpho-syntactic properties, allowing virtually unlimited variation and leading to long-distance dependencies. Since existing MT systems incorporate mostly local information, these are arguably ill-suited for handling thos...
Shallow-Syntax Phrase-Based Translation: Joint versus Factored String-to-Chunk Models
This work extends phrase-based statistical MT (SMT) with shallow syntax dependencies. Two string-to-chunk translation models are proposed: a factored model, which augments phrase-based SMT with layered dependencies, and a joint model, which extends the phrase translation table with microtags, i.e. per-word projections of chunk labels. Both rely on n-gram models of target sequences with different...